On the probability that the convex hull of random points contains the origin

We show that for every $K\geq 1$ there is $c>0$ depending only on $K$ with the following property. Let $n>d\geq 1$, and let $X_1,\dots,X_n$ be independent random vectors with i.i.d. components having a (possibly discrete) symmetric distribution of unit variance and subgaussian moment bounded by $K$. Set $p_{n,d}:=1-2^{-n+1}\sum_{k=0}^{d-1}{n-1\choose k}.$ Then \begin{align*} p_{n,d}\leq {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n\}\mbox{ contains the origin}\big\} \leq p_{n,d} +2\exp(-cd), \end{align*} and \begin{align*} p_{n,d}-2\exp(-cd)\leq {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n\}\mbox{ contains the origin in the interior}\big\} \leq p_{n,d}. \end{align*} We further prove a related result in the context of average-case analysis of linear programs. Let $n\geq d$, let ${\bf 1}$ be an $n$-dimensional vector of all ones, and let $A$ be an $n\times d$ random matrix with i.i.d. symmetrically distributed entries of unit variance and subgaussian moment bounded above by $K$. Then for any non-zero non-random cost vector $\mathfrak c$, $$ \big| {\mathbb P}\big\{\mbox{Linear program ``$\max\langle x,{\mathfrak c}\rangle\quad\mbox{subject to }Ax\leq {\bf 1}$'' is bounded}\big\} -p_{n+1,d} \big|\leq 2\exp(-cd). $$ In particular, the result implies that for $n=2d+\omega(\sqrt{d})$ the linear program is bounded with probability $1-o(1)$, and for $n=2d-\omega(\sqrt{d})$ it is unbounded with probability $1-o(1)$.


Introduction
The problem of computing the probability that the convex hull of a set of random vectors contains the origin has attracted considerable attention; see, in particular, [21, 5, 17, 10, 11, 19, 9, 6, 7]. The following is a classical theorem of Wendel [21] based on combinatorial properties of hyperplane arrangements:

Theorem 1.1 ([21]). Let $n > d$, and let $X_1, X_2, \dots, X_n$ be independent random vectors in $\mathbb{R}^d$ such that
• every $X_i$ is symmetrically distributed, and
• with probability one, every $d$-tuple of the vectors is linearly independent.
Then the probability that the convex hull of $X_1, X_2, \dots, X_n$ contains the origin [in its interior] is given by
$$p_{n,d} := 1 - 2^{-n+1}\sum_{k=0}^{d-1}{n-1\choose k}.$$

The proof of the theorem in [21] relies on the assumption that the hyperplanes $\{X_i^\perp,\ i = 1, \dots, n\}$ in $\mathbb{R}^d$ are in general position with probability one. The main purpose of this note is to provide a matching statement for a class of random vectors with i.i.d. components, allowing for discrete distributions.
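As a quick sanity check of Wendel's formula (our own illustration, not part of the paper), one can estimate the containment probability by Monte Carlo for a continuous distribution, testing $0 \in {\rm conv}\{X_1,\dots,X_n\}$ via LP feasibility. The use of numpy/scipy below, and the trial counts, are implementation choices.

```python
# Monte Carlo check of Wendel's formula for Gaussian points (a sketch;
# the trial count and the use of scipy.optimize.linprog are our choices).
import numpy as np
from scipy.optimize import linprog
from scipy.stats import binom

def contains_origin(X):
    """0 in conv(rows of X) iff some lam >= 0 with sum(lam) = 1, X^T lam = 0."""
    n, d = X.shape
    A_eq = np.vstack([X.T, np.ones((1, n))])
    b_eq = np.append(np.zeros(d), 1.0)
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq)  # default bounds give lam >= 0
    return res.status == 0  # feasible <=> the origin is contained

rng = np.random.default_rng(1)
n, d, trials = 12, 5, 2000
hits = sum(contains_origin(rng.standard_normal((n, d))) for _ in range(trials))
p_wendel = binom.sf(d - 1, n - 1, 0.5)  # p_{n,d} = P{Bin(n-1,1/2) >= d}
print(hits / trials, p_wendel)          # the two numbers should nearly agree
```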
Theorem 1.2. For every $K \geq 1$ there is $c > 0$ depending only on $K$ with the following property. Let $\xi$ be a symmetric random variable of unit variance and with subgaussian moment bounded above by $K$. Let $n > d$, and let $X_1, X_2, \dots, X_n$ be random vectors in $\mathbb{R}^d$ with i.i.d. coordinates equidistributed with $\xi$. Then
$$p_{n,d} \leq {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n\}\mbox{ contains the origin}\big\} \leq p_{n,d} + 2\exp(-cd),$$
and
$$p_{n,d} - 2\exp(-cd) \leq {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n\}\mbox{ contains the origin in the interior}\big\} \leq p_{n,d}.$$
Remark 1.3. Theorem 1.2 shows that linear dependencies within some $d$-subsets of the vectors $X_1, \dots, X_n$, which may occur with high probability for some discrete distributions, cause at most an exponentially small correction to Wendel's formula. Note that Theorem 1.2 identifies $n/d = 2 \pm O(d^{-1/2})$ as the window in which the transition from "does not contain the origin w.h.p." to "contains the origin w.h.p." occurs.
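To see where this window comes from, it may help to record a standard reformulation of Wendel's formula (our own aside, not part of the paper's argument): since $\sum_{k=0}^{n-1}{n-1\choose k} = 2^{n-1}$, we have
$$p_{n,d} = 2^{-n+1}\sum_{k=d}^{n-1}{n-1\choose k} = {\mathbb P}\big\{\mathrm{Bin}(n-1, 1/2) \geq d\big\}.$$
A $\mathrm{Bin}(n-1,1/2)$ variable concentrates around $n/2$ with standard deviation of order $\sqrt{n}$, so by the central limit theorem $p_{n,d} = 1 - o(1)$ when $n \geq 2d + \omega(\sqrt{d})$, $p_{n,d} = o(1)$ when $n \leq 2d - \omega(\sqrt{d})$, and $p_{n,d}$ stays bounded away from $0$ and $1$ inside the window $n/d = 2 \pm O(d^{-1/2})$.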
Fix parameters $n$ and $d$ and a non-zero cost vector $c$, and consider a random linear program of the form
$$\max\,\langle x, c\rangle \quad\mbox{subject to } Ax \leq {\bf 1}, \tag{1}$$
where ${\bf 1}$ is the $n$-dimensional vector of ones, and $A$ is an $n \times d$ random matrix with i.i.d. rows. The computational complexity of the shadow simplex method for linear programs of this form was studied by Borgwardt [1, 2, 3] (see also [4, 8] for a detailed literature overview including more recent developments). Importantly, the row distributions in [1, 2, 3] are assumed to be rotationally invariant, which, in particular, implies that with probability one all entries of $A$ are non-zero and, more generally, non-integer. While much more powerful, the setting of smoothed linear programs introduced by Spielman and Teng [16] has the same characteristics with regard to matrix density and non-integrality. It is of significant interest to explore linear programming in a randomized setting allowing for some degree of sparsity (in particular, analysis of the simplex method under zero-preserving perturbations is listed in [16] among future research directions). Consider a basic question: under what conditions on the distributions of the matrix rows and the parameters $d$, $n$ is the program (1) bounded w.h.p.? In the setting of continuous distributions, [21] provides a complete answer:

Corollary 1.4 (Corollary of the result of [21], see Remark 2.3 below). Let $n \geq d$, and let $A$ be an $n \times d$ random matrix with independent rows such that
• every row is symmetrically distributed, and
• for any fixed hyperplane $H$ and $i \leq n$, the probability that ${\rm row}_i(A)$ belongs to $H$ is zero.
Then for any non-zero cost vector $c$,
$${\mathbb P}\big\{\mbox{linear program (1) is bounded}\big\} = p_{n+1,d}.$$

Note that, just as with the probability of absorbing the origin, the transition from "linear program (1) is unbounded w.h.p." to "(1) is bounded w.h.p." occurs in the window $n/d = 2 \pm O(d^{-1/2})$. As a variation of our main result, we get a matching estimate for arbitrary symmetric distributions with i.i.d. components:

Theorem 1.5. For any $K \geq 1$ there is $c > 0$ depending only on $K$ with the following property. Let $n \geq d$, and let $A$ be an $n \times d$ random matrix with i.i.d. symmetrically distributed entries of unit variance and subgaussian moment bounded above by $K$. Then for any non-zero cost vector $c$,
$$\big|{\mathbb P}\big\{\mbox{linear program (1) is bounded}\big\} - p_{n+1,d}\big| \leq 2\exp(-cd).$$

Remark 1.6. Among standard examples of non-Gaussian distributions satisfying the assumptions of Theorems 1.2 and 1.5 are the Rademacher and Bernoulli-Gaussian distributions. In particular, assume that the entries of $A$ are equidistributed with a Bernoulli-Gaussian variable of the form $b\,g$, where $b$ and $g$ are independent, $b \sim \mathrm{Bernoulli}(p)$, $g \sim N(0,1)$, and the parameter $p$ is a small positive constant. The corresponding optimization problem (1) can be viewed as the simplest randomized model of a linear program with a diluted coefficient matrix.
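The transition predicted by Theorem 1.5 is easy to observe numerically. The following sketch (our own illustration; the solver choice, trial counts, and Rademacher entries are assumptions for the experiment, not part of the theorem) estimates the probability that the program (1) is bounded:

```python
# Empirical probability that the LP "max <x,c> s.t. Ax <= 1" is bounded,
# for a Rademacher coefficient matrix (an illustrative sketch only).
import numpy as np
from scipy.optimize import linprog

def prob_bounded(n, d, trials=200, seed=0):
    rng = np.random.default_rng(seed)
    cost = np.zeros(d); cost[0] = 1.0             # any fixed non-zero cost vector
    bounded = 0
    for _ in range(trials):
        A = rng.choice([-1.0, 1.0], size=(n, d))  # i.i.d. symmetric +-1 entries
        # maximize <x,cost> == minimize <x,-cost>; x is a free variable,
        # and x = 0 is always feasible, so the LP is never infeasible
        res = linprog(-cost, A_ub=A, b_ub=np.ones(n), bounds=(None, None))
        bounded += (res.status == 0)              # status 3 means unbounded
    return bounded / trials

d = 30
for n in (int(1.5 * d), 2 * d, 3 * d):
    print(n, d, prob_bounded(n, d))  # roughly 0, 1/2 and 1 as n/d crosses 2
```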
1.1. Proof overview. A quick inspection shows that Theorems 1.2 and 1.5 are closely related (see Subsection 2.1 for details). At this point we discuss a "direct" proof of Theorem 1.2, since it is somewhat less technical than the study of the random linear programs and at the same time conveys all the main ideas. Let $X_1, \dots, X_n$ be i.i.d. random vectors in $\mathbb{R}^d$ with i.i.d. coordinates having a symmetric distribution, unit variance, and a bounded subgaussian moment. An approximation argument (see Subsection 2.1) shows that in order to prove Theorem 1.2 it is sufficient to show that
$${\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n\}\mbox{ contains the origin on its boundary}\big\}$$
is exponentially small in $d$. Disregarding the scenario $\dim {\rm conv}\{X_1,\dots,X_n\} \leq d-1$ (the probability of that event is shown to be exponentially small in $d$ by applying standard results of non-asymptotic random matrix theory), the problem is reduced to computing the probability that $0 \in {\rm conv}\{X_{i_1},\dots,X_{i_d}\}$ for some $d$-tuple $X_{i_1},\dots,X_{i_d}$ of affinely independent vertices such that ${\rm conv}\{X_{i_1},\dots,X_{i_d}\}$ lies on the boundary of the polytope (i.e., is a facet or a subset of a facet). It is not difficult to check that, conditioned on the event that $0$ is contained in the affine hyperplane spanned by $X_{i_1},\dots,X_{i_d}$ and that no proper subset of the vectors is linearly dependent, the [conditional] probability of $0 \in {\rm conv}\{X_{i_1},\dots,X_{i_d}\}$ equals $2^{1-d}$. The assumption that ${\rm conv}\{X_{i_1},\dots,X_{i_d}\}$ is a subset of a facet also implies that, for a vector $y$ orthogonal to ${\rm conv}\{X_{i_1},\dots,X_{i_d}\}$, the inner products $\langle y, X_j\rangle$, $j \neq i_1,\dots,i_d$, are either all non-negative or all non-positive. In an ideal situation when ${\mathbb P}\{\langle X_j, y\rangle = 0\} = 0$, this would supply an extra factor $2^{-n+d+1}$ to our probability estimate. Overall, that would imply that for fixed $i_1,\dots,i_d$, the probability that ${\rm conv}\{X_{i_1},\dots,X_{i_d}\}$ (a) is $(d-1)$-dimensional, (b) is a subset of a facet, and (c) contains the origin, could be estimated from above by
$$2^{1-d}\cdot 2^{-n+d+1}\cdot \exp(-c'd),$$
where the factor $\exp(-c'd)$ is an upper bound on the probability that the affine span of the vectors $X_{i_1},\dots,X_{i_d}$ contains the origin. It is not difficult to see that ${n\choose d}\,2^{1-d}\cdot 2^{-n+d+1} \leq 4$ for arbitrary $n > d$, and hence the estimate survives a crude union bound over all unordered $d$-tuples of indices, producing the required result. The above sketch ignores the fact that degenerate cases (possible linear dependencies among the $X_i$'s) may occur with positive probability. The essence of the proof of Theorem 1.2 is in carefully accounting for those degenerate cases and making sure that they do not destroy the probability estimates.
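For the reader's convenience, we spell out the symmetry computation behind the factor $2^{1-d}$ (a standard argument, stated here under the simplifying assumptions of the sketch). Condition on a realization in which $0 = \sum_{j=1}^d \lambda_j X_{i_j}$ for some coefficients $\lambda_j \neq 0$, unique up to scaling. Replacing $X_{i_j}$ with $\varepsilon_j X_{i_j}$ for a sign pattern $\varepsilon \in \{-1,1\}^d$ (all $2^d$ patterns being conditionally equiprobable, by the symmetry of the distributions) replaces the coefficient vector $(\lambda_j)$ with $(\varepsilon_j\lambda_j)$, and $0 \in {\rm conv}\{\varepsilon_1 X_{i_1},\dots,\varepsilon_d X_{i_d}\}$ precisely when all $\varepsilon_j\lambda_j$ are of the same sign, that is, for exactly $2$ of the $2^d$ patterns. This gives the conditional probability $2 \cdot 2^{-d} = 2^{1-d}$.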
Notation. The spectral norm of a matrix $B$ will be denoted by $\|B\|$. The unit Euclidean sphere in $\mathbb{R}^h$ is denoted by $S^{h-1}$. Given a vector $x \in S^{h-1}$, we write $x\chi_{[h-1]}$ for the $(h-1)$-dimensional vector obtained from $x$ by deleting its $h$-th component.
Throughout the note, for a set $S \subset \mathbb{R}^d$, the interior of $S$ is the set $S \setminus \partial S$, where $\partial S$ is the boundary of the set defined w.r.t. the standard topology in $\mathbb{R}^d$. Further, the relative interior of $S$ is the subset $S \setminus \partial_H S$, where $H$ is the affine linear span of $S$ and $\partial_H S$ is the [relative] boundary of $S$ defined w.r.t. the standard topology in $H$.

Preliminaries
2.1. Reductions. The first step in proving Theorems 1.2 and 1.5 is the observation that an approximation argument, together with Wendel's formula, yields the following one-sided estimates.
Proposition 2.1. Let $n > d$ and let $Y_1, \dots, Y_n$ be independent random vectors in $\mathbb{R}^d$ having symmetric distributions. Then
$${\mathbb P}\big\{{\rm conv}\{Y_1,\dots,Y_n\}\mbox{ is $d$-dimensional and contains the origin in its interior}\big\} \leq p_{n,d} \leq {\mathbb P}\big\{{\rm conv}\{Y_1,\dots,Y_n\}\mbox{ contains the origin}\big\}.$$
We refer to [10, Proposition 2.12] for a proof of a related statement which, with minor changes, also verifies Proposition 2.1.
As a corollary of the proposition, we get

Proposition 2.2. Let $n \geq d$, and let $A$ be an $n \times d$ random matrix with independent symmetrically distributed rows $X_1, \dots, X_n$. Then for any non-zero cost vector $c$,
$$\big|{\mathbb P}\big\{\mbox{linear program (1) is bounded}\big\} - p_{n+1,d}\big| \leq {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, c\}\mbox{ contains the origin on its boundary}\big\}.$$
Proof. Let $s$ be a symmetric sign variable independent from $A$. Observe that, in view of the symmetric distributions of the $X_i$'s, the linear program "$\max\langle x, sc\rangle$ subject to $Ax \leq {\bf 1}$" is bounded with the same probability as the linear program (1), and
$${\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, -sc\}\mbox{ contains the origin on its boundary}\big\} = {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, c\}\mbox{ contains the origin on its boundary}\big\}.$$

Note that the program "$\max\langle x, sc\rangle$ subject to $Ax \leq {\bf 1}$" is bounded whenever ${\rm conv}\{X_1,\dots,X_n, -sc\}$ is $d$-dimensional and contains the origin in the interior, and that its boundedness implies that ${\rm conv}\{X_1,\dots,X_n, -sc\}$ contains the origin. Applying Proposition 2.1 with $Y_i := X_i$, $i \leq n$, and $Y_{n+1} := -sc$, we get
$${\mathbb P}\big\{\mbox{(1) is bounded}\big\} \leq p_{n+1,d} + {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, -sc\}\mbox{ contains the origin on its boundary}\big\},$$
and, similarly,
$${\mathbb P}\big\{\mbox{(1) is bounded}\big\} \geq p_{n+1,d} - {\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, -sc\}\mbox{ contains the origin on its boundary}\big\}.$$
The result follows.
Remark 2.3. Under the assumption that ${\mathbb P}\{X_i \in H\} = 0$, $1 \leq i \leq n$, for every fixed hyperplane $H$, the last proposition implies that for any non-zero cost vector $c$,
$${\mathbb P}\big\{\mbox{linear program (1) is bounded}\big\} = p_{n+1,d},$$
which yields Corollary 1.4.
Remark 2.4. Assume that for every $n \geq d$ and for $X_1, \dots, X_n$ as in Proposition 2.2 we are able to show that
$${\mathbb P}\big\{{\rm conv}\{X_1,\dots,X_n, c\}\mbox{ contains the origin on its boundary}\big\} \leq 2\exp(-cd)$$
for some $c > 0$ depending only on $K$. In view of Proposition 2.2, this would imply the statement of Theorem 1.5. Moreover, by replacing $c$ with a random vector $X_{n+1}$ independent from and equidistributed with $X_1, \dots, X_n$, we would also obtain Theorem 1.2.
Lemma 2.6 (Norms of subgaussian matrices; see, for example, [18, Section 4.4]). For every $K \geq 1$ and $R > 0$ there is $C_{2.6} > 0$ depending only on $K$ and $R$ with the following property. Let $A$ be an $N \times n$ random matrix with i.i.d. entries of zero mean, unit variance, and with subgaussian moment bounded above by $K$. Then
$${\mathbb P}\big\{\|A\| > C_{2.6}(\sqrt{N} + \sqrt{n})\big\} \leq 2\exp(-R(N+n)).$$
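As a numerical aside (ours, not the paper's), the scale of the bound in Lemma 2.6 is easy to observe: for i.i.d. standard normal entries, the spectral norm is close to $\sqrt{N} + \sqrt{n}$.

```python
# Spectral norm of random matrices with i.i.d. N(0,1) entries, compared
# against the sqrt(N) + sqrt(n) prediction (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
for N, n in [(200, 50), (500, 100), (1000, 1000)]:
    A = rng.standard_normal((N, n))
    ratio = np.linalg.norm(A, 2) / (np.sqrt(N) + np.sqrt(n))
    print(N, n, round(ratio, 3))  # ratios settle near 1
```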

Sparse and compressible vectors.
Definition 2.7 (Sparse vectors). Given $m \leq h$, a vector $y \in \mathbb{R}^h$ is called $m$-sparse if it has at most $m$ non-zero components.
Definition 2.8 (Compressible vectors [12, 14]). Given parameters $\delta, \rho > 0$, define ${\rm Comp}_h(\delta, \rho)$ as the set of all unit vectors $y$ in $\mathbb{R}^h$ such that the Euclidean distance from $y$ to the set of $\delta h$-sparse vectors is at most $\rho$. The vectors from ${\rm Comp}_h(\delta, \rho)$ are called $(\delta, \rho)$-compressible.
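The distance from a vector to the set of $m$-sparse vectors admits an explicit formula: the best $m$-sparse approximation keeps the $m$ largest-magnitude entries. A small illustration of ours (not from the paper):

```python
# Membership test for Comp_h(delta, rho): the distance from y to the set
# of m-sparse vectors is the L2 norm of the h - m smallest-magnitude entries.
import numpy as np

def dist_to_sparse(y, m):
    return np.linalg.norm(np.sort(np.abs(y))[:-m]) if m > 0 else np.linalg.norm(y)

def is_compressible(y, delta, rho):
    """Check whether a unit vector y lies in Comp_h(delta, rho)."""
    return dist_to_sparse(y, int(delta * y.size)) <= rho

h = 100
e1 = np.zeros(h); e1[0] = 1.0           # 1-sparse, hence compressible
flat = np.ones(h) / np.sqrt(h)          # completely spread out
print(is_compressible(e1, 0.1, 0.3))    # True
print(is_compressible(flat, 0.1, 0.3))  # False: the tail norm is about sqrt(0.9)
```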
Lemma 2.9. For every $K, R \geq 1$ there are $c_{2.9}, \beta_{2.9}, \upsilon_{2.9} > 0$ depending only on $K, R$ with the following property. Let $n, d \geq 1$ satisfy $d \leq Rn$, and let $B = (B_{ij})$ be a $d \times (n+1)$ matrix, where the entries $B_{ij}$, $1 \leq i \leq d$, $1 \leq j \leq n$, are i.i.d. of zero mean, unit variance, and with subgaussian moment bounded above by $K$, and the $(n+1)$-st column of $B$ is non-random of Euclidean norm $\sqrt{d}$. Then with probability at least $1 - 2\exp(-c_{2.9}d)$ we have $Bx \neq 0$ for every $(\beta_{2.9}, \upsilon_{2.9})$-compressible vector $x$.

Remark 2.10. The proof of the above lemma is a standard application of the $\varepsilon$-net argument; see, for example, the proof of [14, Lemma 3.3], as well as [18, Section 4.6]. The lemma will be applied with $B = [A^\top, c]$, where $A$ is the coefficient matrix of the linear program (1), and $c$ is an appropriately normalized cost vector.

Lemma 2.11. For every $K \geq 1$ there are $c_{2.11}, \delta_{2.11}, \rho_{2.11} > 0$ depending only on $K$ with the following property. Let $\xi$ be a symmetric random variable of unit variance with subgaussian moment bounded above by $K$. Let $n \geq d \geq 1$, and let $A$ be an $n \times d$ random matrix with i.i.d. entries equidistributed with $\xi$. Then
$${\mathbb P}\big\{\mbox{There is } y \in {\rm Comp}_d(\delta_{2.11}, \rho_{2.11}) \mbox{ such that } \langle{\rm row}_i(A), y\rangle \leq 0 \mbox{ for all } i \leq n\big\} \leq 2\exp(-c_{2.11}n).$$

Proof. We start by observing that there is $c_1 \in (0,1)$ depending only on $K$ such that for every $m \geq 1$, every unit vector $\tilde y = (\tilde y_1, \dots, \tilde y_m) \in \mathbb{R}^m$, and i.i.d. variables $\xi_1, \dots, \xi_m$ equidistributed with $\xi$, we have
$${\mathbb P}\Big\{\sum_{i=1}^m \tilde y_i\,\xi_i \geq c_1\Big\} \geq c_1$$
(the claim can be verified by noting that the random variable $\sum_{i=1}^m \tilde y_i\,\xi_i$ is symmetric, of unit variance, and with subgaussian moment of order $O(K)$; for the last assertion see, for example, [18, Section 2.6]). Further, set $C := C_{2.6}(K, 1)$, so that $\|A\| \leq C(\sqrt{n} + \sqrt{d})$ with probability at least $1 - 2\exp(-(n+d))$. In what follows, we assume that $\delta d \geq 1$. In view of the above observation, for any fixed $\delta d$-sparse unit vector $y$ the event $\{\langle{\rm row}_i(A), y\rangle \leq 0$ for all $i \leq n\}$ has probability at most $(1 - c_1)^n$, and on the event $\{\|A\| \leq C(\sqrt{n} + \sqrt{d})\}$ this estimate extends, with adjusted constants, to all unit vectors within a small Euclidean distance of the $\delta d$-sparse vectors. The probability of the event in the statement of the lemma can then be estimated, via the union bound argument over a net of sparse vectors and our choice of the constants, by $2\exp(-c_{2.11}n)$.
Remark 2.12. Note that the event
$$\big\{\mbox{There is } y \in {\rm Comp}_d(\delta, \rho) \mbox{ such that } \langle{\rm row}_i(A), y\rangle \leq 0 \mbox{ for all } i \leq n\big\}$$
can be interpreted as the event that there is a separating hyperplane $H$ for the random polyhedron $P = {\rm conv}\{{\rm row}_i(A),\ i \leq n\}$ such that $H$ passes through the origin and the unit normal vector to $H$ is $(\delta, \rho)$-compressible.
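Several notions from Subsection 2.4 enter the statements below but are not reproduced in this fragment; in particular, the least common denominator (LCD). For the reader's convenience we recall the standard definition from the Rudelson-Vershynin theory (our restatement; the normalization used in [15] may differ in minor details): for parameters $\alpha, \gamma > 0$ and a vector $x \in S^{h-1}$,
$${\rm LCD}_{\alpha,\gamma}(x) := \inf\big\{\theta > 0:\ {\rm dist}(\theta x, \mathbb{Z}^h) < \min(\gamma\|\theta x\|_2, \alpha)\big\}.$$
A large LCD means that no rescaling of $x$ is well approximated by an integer point, which translates into strong anti-concentration of $\langle X, x\rangle$ for random vectors $X$ with i.i.d. coordinates.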
Definition 2.19 (Level sets, [15]). Given any number $D \geq c_0\sqrt{h}$, define
$$S_D^{(h)}(\alpha) := \big\{y \in {\rm Incomp}_h(\delta, \rho):\ D \leq {\rm LCD}_{\alpha,\gamma_0}(y) < 2D\big\}.$$
We will write $S_D(\alpha)$ when the dimension is clear from the context.
Remark 2.21. [15, Lemma 4.6] is stated under different assumptions on the matrix dimensions; however, the proof of the lemma in [15] works under our conditions on $v$ and $h$ as well.
The next lemma is based on standard arguments which can already be found in [12]. We provide a proof for completeness. As before, the non-random column of the random matrix in the lemma is introduced to deal with the cost vector $c$.

Lemma 2.22. For every $s > 0$ and $M > 0$ there is $C_{2.22} > 0$ depending on $s$, $M$ with the following property. Let $v/(1+s) \geq m \geq C_{2.22}$, and let $B = (B_{ij})$ be a $v \times m$ matrix such that the entries $B_{ij}$, $1 \leq i \leq v$, $1 \leq j \leq m-1$, are i.i.d. of zero mean, unit variance, and subgaussian moment bounded above by $K$, and the $m$-th column of $B$ is non-random of Euclidean norm in the interval $[\sqrt{v}/2, 2\sqrt{v}]$. Then
$${\mathbb P}\big\{Bx = 0 \mbox{ for some } (\delta,\rho)\mbox{-incompressible vector } x\big\} \leq \exp(-Mv).$$

Proof. We will assume that $m \geq C_{2.14}(\delta)$, so that the conclusion of Lemma 2.14 is available for any $(\delta,\rho)$-incompressible vector $y \in S^{m-1}$, and so that, in view of Lemma 2.6, $\|B\| \leq L\sqrt{v}$ with probability at least $1 - \exp(-2Mv)$, for an appropriate $L > 0$ depending only on $K$, $M$, $s$. Fix for a moment any $t \in (0, 1/2]$, and let $\mathcal{N} \subset {\rm Incomp}_m(\delta, \rho)$ be a $2t$-net on the set of $(\delta,\rho)$-incompressible vectors of size $|\mathcal{N}| \leq (1 + 2/t)^m$. In view of our definition of $c_0$ and Lemma 2.17, the small ball probability of $\|By\|_2$ admits an exponential bound for every vector $y \in \mathcal{N}$; applying Lemma 2.20 (with $B'$ obtained from $B$ by removing the $m$-th column), we obtain such a bound for every $y \in \mathcal{N}$, and hence for
$${\mathbb P}\big\{Bx = 0 \mbox{ for some } (\delta,\rho)\mbox{-incompressible vector } x, \mbox{ and } \|B\| \leq L\sqrt{v}\big\}.$$

This implies
$${\mathbb P}\big\{Bx = 0 \mbox{ for some } (\delta,\rho)\mbox{-incompressible vector } x\big\} \leq \exp(-Mv),$$
assuming that $m$ is sufficiently large and choosing $t$ appropriately small.

Lemma 2.24 (Null incompressible vectors with small LCD). For any $M > 0$ there are $\kappa_{2.24} \in (0, c_0]$ and $C_{2.24} \geq C_{2.14}$, $c_{2.24} > 0$ depending only on $K$, $M$ with the following property.
• Let $h \geq C_{2.24}$, and let $B = (B_{ij})$ be an $(h-1) \times h$ random matrix, where the entries $B_{ij}$, $1 \leq i \leq h-1$, $1 \leq j \leq h-1$, are i.i.d. of zero mean, unit variance, and subgaussian moment bounded above by $K$, and the $h$-th column of $B$ is non-random. Then the event
$$\big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible null unit vector } x \mbox{ for } B \mbox{ with } {\rm LCD}_{\kappa_{2.24}\sqrt{h},\gamma_0}(x\chi_{[h-1]}) \leq \exp(c_{2.24}h)\big\}$$
has probability at most $\exp(-Mh)$.
• Let $h \geq C_{2.24}$, and let $B$ be an $(h-1) \times h$ random matrix with i.i.d. entries of zero mean, unit variance, and subgaussian moment bounded above by $K$. Then the event
$$\big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible null unit vector } x \mbox{ for } B \mbox{ with } {\rm LCD}_{\kappa_{2.24}\sqrt{h},\gamma_0}(x) \leq \exp(c_{2.24}h)\big\}$$
has probability at most $\exp(-Mh)$.
Proof. We will only prove the first (slightly more technical) part of the lemma; the proof of the second part is very similar. In view of Lemma 2.14, for every $(\delta,\rho)$-incompressible vector $x$, the Euclidean norm of $x\chi_{[h-1]}$ is bounded away from zero. Fix for a moment any $D$ in the range $c_0\sqrt{h-1} \leq D \leq \exp(c_{2.24}h)$, and consider the event
$$\mathcal{E}_D := \big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible null unit vector } x \mbox{ for } B \mbox{ with } x\chi_{[h-1]}/\|x\chi_{[h-1]}\|_2 \in S_D(\alpha)\big\}.$$
Let $\mathcal{N}_D$ be a $(4\alpha/D)$-net on $S_D(\alpha)$ of cardinality at most $(1 + 9D/\sqrt{h-1})^{h-1}$ (the net exists according to Lemma 2.23), and define a discrete subset $\mathcal{N}$ of $\mathbb{R}^h$ by rescaling the elements of $\mathcal{N}_D$ and appending the last coordinate.

We claim that on $\mathcal{E}_D$ the matrix $B$ maps some element of $\mathcal{N}$ to a vector of small Euclidean norm. Indeed, fix any realization of $B$ from $\mathcal{E}_D$, let $x$ be a vector from the definition of $\mathcal{E}_D$, and let $u_x$ be an appropriately rescaled copy of $x$; observe that $u_x$ is well approximated by an element of $\mathcal{N}$. Further, let $y$ be an element of $\mathcal{N}_D$ closest to the normalized vector $x\chi_{[h-1]}/\|x\chi_{[h-1]}\|_2$; comparing $Bu_x$ with the image of the corresponding element of $\mathcal{N}$ and using the norm bound on $B$, the claim follows.

To estimate the probability of $\mathcal{E}_D$, we apply the claim together with Lemma 2.20: the union bound over $\mathcal{N}$, combined with the small ball estimates of Lemma 2.20, yields an upper bound on ${\mathbb P}(\mathcal{E}_D)$. Note that for $\kappa$ sufficiently small, for $D \leq \exp(c_{2.20}\kappa^2(h-1))$, and for $h$ greater than a large enough constant (depending on $\kappa$, $C_{2.20}$, $M$), the last quantity can be made smaller than $\exp(-2Mh)$.

Taking the union bound over the events $\mathcal{E}_D$ with $D$ ranging over a geometric sequence covering $[c_0\sqrt{h-1}, \exp(c_{2.24}h)]$, we get
$${\mathbb P}\big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible null unit vector } x \mbox{ for } B \mbox{ with } {\rm LCD}_{\alpha,\gamma_0}(x\chi_{[h-1]}) \leq \exp(c_{2.24}h)\big\} \leq \exp(-Mh),$$
and the result follows.
3. Proof of Theorems 1.2 and 1.5

In this section, we assume the same choice for the parameters $\delta, \rho$ and $c_0, \gamma_0$ as in Subsection 2.4.
Lemma 3.1. Let $M > 0$, let $C_{2.24}, c_{2.24} > 0$, $\kappa = \kappa_{2.24}$ be as in Lemma 2.24, and let $C_{2.18}, c_{2.18} > 0$ be as in Theorem 2.18 (with parameters $K$ and $\gamma_0$). Assume that $d \geq m \geq C_{2.24}$. Let $\xi$ be a symmetric random variable of unit variance and subgaussian moment bounded above by $K$. Further, let $X_1, \dots, X_{m-1}$ be i.i.d. random vectors in $\mathbb{R}^d$ with independent components equidistributed with $\xi$, let $X$ be a random vector in $\mathbb{R}^d$ independent from $\{X_1, \dots, X_{m-1}\}$, and denote by $W$ the $d \times m$ random matrix with columns $X_1, \dots, X_{m-1}, X$. Then the event
$$\big\{W \mbox{ has rank } m-1, \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible unit vector } \lambda = (\lambda_1, \dots, \lambda_m)^\top \mbox{ with strictly positive components such that } W\lambda = 0\big\}$$
has probability at most $2^{-m+1}$ multiplied by a factor exponentially small in $m$ (with the exponent depending only on $K$ and $M$).
Proof. Without loss of generality, we can assume that the vector $X$ is constant. Denote the event in question by $\mathcal{E}$. First, consider an auxiliary event
$$\mathcal{E}' := \big\{W \mbox{ has rank } m-1, \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible unit vector } \lambda = (\lambda_1, \dots, \lambda_m)^\top \mbox{ with non-zero components such that } W\lambda = 0\big\}.$$
Assume that the probability of $\mathcal{E}'$ is non-zero. We can construct a mapping $Q : \mathcal{E}' \to \{-1, 1\}^{m-1}$ by setting $Q := ({\rm sign}(\lambda_1), \dots, {\rm sign}(\lambda_{m-1}))$, where $\lambda$ is the $(\delta,\rho)$-incompressible unit vector with $\lambda_m > 0$ satisfying $W\lambda = 0$ (note that at every point of $\mathcal{E}'$ the vector is uniquely determined because of the assumptions on the matrix rank). Conditioned on $\mathcal{E}'$, $Q$ assigns equal probabilities to all of the $2^{m-1}$ admissible $\pm 1$ vectors, which follows from the assumption that the $X_i$'s are symmetrically distributed. Thus,
$${\mathbb P}(\mathcal{E}) = 2^{-m+1}\,{\mathbb P}(\mathcal{E}'),$$
and it is sufficient to estimate the probability of $\mathcal{E}'$. Set $\alpha := \kappa_{2.24}\sqrt{m-1}$. For every subset $J \subset [d]$ of size $m-1$, let $W_J$ be the $(m-1) \times m$ submatrix of $W$ obtained by selecting rows with indices in $J$, and let $\mathcal{E}_J$ and $\tilde{\mathcal{E}}_J$ be the events
$$\mathcal{E}_J := \big\{W_J \mbox{ has rank } m-1, \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible vector } \lambda = (\lambda_1, \dots, \lambda_m)^\top \mbox{ with } W_J\lambda = 0 \mbox{ and } {\rm LCD}_{\alpha,\gamma_0}(\lambda\chi_{[m-1]}) > \exp(c_{2.24}m)\big\},$$
$$\tilde{\mathcal{E}}_J := \big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible vector } \lambda \mbox{ with } W_J\lambda = 0 \mbox{ and } {\rm LCD}_{\alpha,\gamma_0}(\lambda\chi_{[m-1]}) \leq \exp(c_{2.24}m)\big\}.$$
First, we observe that
$$\mathcal{E}' \subset \bigcup_{J}\big(\mathcal{E}_J \cup \tilde{\mathcal{E}}_J\big).$$
Further, note that in view of Lemma 2.24, the probability of each $\tilde{\mathcal{E}}_J$ is bounded above by $\exp(-Mm)$. Next, conditioned on any realization of $W_J$ of rank $m-1$, a unit vector $\lambda$ such that $W_J\lambda = 0$ is uniquely determined up to a sign, allowing us to decouple the event $\{W_J\lambda = 0\}$ from the events $\{\langle{\rm row}_i(W), \lambda\rangle = 0\}$, $i \notin J$. Specifically, we can write for each $J \subset [d]$ with $|J| = m-1$:
$${\mathbb P}(\mathcal{E}_J) \leq {\mathbb P}\big\{W_J \mbox{ has rank } m-1, \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible vector } \lambda \mbox{ with } W_J\lambda = 0\big\}\cdot \sup_y\big({\mathbb P}\{\langle{\rm row}_1(W), y\rangle = 0\}\big)^{d-m+1},$$
where the supremum is taken over all unit $(\delta,\rho)$-incompressible vectors $y$ with ${\rm LCD}_{\alpha,\gamma_0}(y\chi_{[m-1]}) > \exp(c_{2.24}m)$; by Theorem 2.18, each factor in the product is at most $\exp(-c'm)$ for some $c' > 0$ depending only on $K$. Summing the estimates over the ${d \choose m-1}$ subsets $J$ and recalling that ${\mathbb P}(\mathcal{E}) = 2^{-m+1}{\mathbb P}(\mathcal{E}')$, we obtain the desired bound.

Proof of Theorems 1.2 and 1.5. Denote $P'_d := {\rm conv}\{X_1, \dots, X_n, X_{n+1}\}$. Set $M := 5$ and $R := 3$. We will assume that the parameters $\delta, \rho, \gamma_0 > 0$ are as in the previous section, that $\alpha = \kappa_{2.24}\sqrt{d}$, and that $C_{2.24}, c_{2.24}$ are as in Lemma 2.24 (with our choice of $M$). We will further assume that $d \geq \max(C_{2.24}, C_{2.18})$. In view of Remark 2.4, it suffices to show that the probability that $P'_d$ contains the origin on its boundary is exponentially small in $d$. We use the inclusion
$$\big\{P'_d \mbox{ contains the origin on its boundary}\big\} \subset \mathcal{E}_1 \cup \mathcal{E}_2 \cup \mathcal{E}_3 \cup \mathcal{E}_4 \cup \mathcal{E}_5 \cup \bigcup_{J, I}\big(\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5)\big),$$
where
$$\mathcal{E}_1 := \big\{P'_d \mbox{ has dimension at most } d-1 \mbox{ and contains the origin}\big\},$$
$$\mathcal{E}_2 := \big\{\mbox{There is a } (\delta,\rho)\mbox{-compressible unit vector } y \in \mathbb{R}^d \mbox{ such that } \langle X_i, y\rangle \leq 0 \mbox{ for all } i \leq n+1\big\},$$
$$\mathcal{E}_3 := \big\{\mbox{There are } i_1 < \dots < i_d \mbox{ so that } {\rm conv}\{X_{i_1}, \dots, X_{i_d}\} \mbox{ contains the origin, and a } (\delta,\rho)\mbox{-incompressible vector } y \mbox{ orthogonal to } X_{i_1}, \dots, X_{i_d} \mbox{ satisfies } {\rm LCD}_{\alpha,\gamma_0}(y) \leq \exp(c_{2.24}d)\big\},$$
$$\mathcal{E}_4 := \big\{\mbox{There is a subset } J \subset [n+1] \mbox{ of size at most } \sqrt{d} \mbox{ such that } {\rm conv}\{X_i,\ i \in J\} \mbox{ contains the origin in its relative interior}\big\},$$
$$\mathcal{E}_5 := \big\{\|X_i\|_2 \notin [\sqrt{d}/2,\, 2\sqrt{d}] \mbox{ for some } i \leq n+1\big\},$$
and for every pair of subsets $J \subset I \subset [n+1]$ with $|I| = d$ and $|J| > \sqrt{d}$,
$$\mathcal{E}_{J,I} := \big\{X_i,\ i \in I, \mbox{ are affinely independent; } {\rm conv}\{X_i,\ i \in I\} \mbox{ is on the boundary of } P'_d;\ {\rm conv}\{X_i,\ i \in J\} \mbox{ contains the origin in its relative interior; and a unit vector } y \mbox{ orthogonal to } X_i,\ i \in I, \mbox{ satisfies } {\rm LCD}_{\alpha,\gamma_0}(y) \geq \exp(c_{2.24}d)\big\}.$$
Indeed, consider any point $\omega$ of the probability space in the intersection of $(\mathcal{E}_1 \cup \mathcal{E}_2 \cup \mathcal{E}_4)^c$ with the event that $P'_d$ contains the origin on its boundary. Necessarily, at $\omega$ there are $d$ affinely independent vertices $X_{i_1}, \dots, X_{i_d}$ such that ${\rm conv}\{X_{i_1}, \dots, X_{i_d}\}$ is a part of the boundary of $P'_d$ and contains $0$. Let $y$ be a unit vector orthogonal to $X_{i_1}, \dots, X_{i_d}$, and note that, being in the complement of $\mathcal{E}_2$, we get that $y$ is $(\delta,\rho)$-incompressible. If ${\rm LCD}_{\alpha,\gamma_0}(y) \leq \exp(c_{2.24}d)$ then $\omega \in \mathcal{E}_3$. Otherwise, set $I := \{i_1, \dots, i_d\}$, and let $J$ be a subset of $I$ such that $0$ is contained in the relative interior of ${\rm conv}\{X_i,\ i \in J\}$ (note that $J$ exists and has size more than $\sqrt{d}$, since $\omega \in \mathcal{E}_4^c$). Then, by the above definition, $\omega \in \mathcal{E}_{J,I}$, and the claim is verified. Next, we estimate the probabilities of the above events.
The probability of $\mathcal{E}_1$ can be bounded above by the probability of the event that the smallest singular value of the $d \times n$ matrix with columns $X_i$, $1 \leq i \leq n-1$, and $X_n - X_{n+1}$ is zero. This matrix can be viewed as a non-random shift (with the shift matrix of spectral norm $\sqrt{d}$) of a centered $d \times n$ matrix with i.i.d. subgaussian entries of unit variance. Invertibility of such matrices has been extensively studied in the literature; for our purposes it suffices to apply the result of [13] to infer that ${\mathbb P}(\mathcal{E}_1) \leq 2\exp(-c_1 n)$, where $c_1 > 0$ may only depend on $K$.
Further, applying Lemma 2.11, we get
$${\mathbb P}(\mathcal{E}_2) \leq {\mathbb P}\big\{\mbox{There is a } (\delta,\rho)\mbox{-compressible unit vector } y \in \mathbb{R}^d \mbox{ such that } \langle X_i, y\rangle \leq 0 \mbox{ for all } i \leq n\big\} \leq 2\exp(-c_{2.11}n).$$
To estimate the probability of $\mathcal{E}_3$, we apply the second part of Lemma 2.24 and a union bound argument. Observe that, in view of the equidistribution of $X_1, \dots, X_n$, the probability of $\mathcal{E}_3$ is controlled, via the union bound, in terms of
$${\mathbb P}\big\{\mbox{There is a } (\delta,\rho)\mbox{-incompressible vector } y \mbox{ orthogonal to } X_1, \dots, X_{d-1} \mbox{ with } {\rm LCD}_{\alpha,\gamma_0}(y) \leq \exp(c_{2.24}d)\big\},$$
and hence, by Lemma 2.24 and our choice of $M$,
$${\mathbb P}(\mathcal{E}_3) \leq 2^{3d}\exp(-Md) < \exp(-d).$$
A standard concentration estimate for the Euclidean norm of random vectors with i.i.d. subgaussian components of unit variance (see, for example, [18, Section 3.1]) implies that ${\mathbb P}(\mathcal{E}_5) \leq 2(n+1)\exp(-c_5 d)$ for some $c_5 > 0$ depending only on $K$.
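The concentration estimate referred to here has the following standard form (our restatement of the textbook bound; constants depend only on $K$): for each $i$,
$${\mathbb P}\big\{\big|\,\|X_i\|_2 - \sqrt{d}\,\big| \geq t\sqrt{d}\big\} \leq 2\exp(-c\,t^2 d), \qquad t \geq 0,$$
so taking $t = 1/2$ and a union bound over $i \leq n+1$ keeps all norms in $[\sqrt{d}/2,\, 2\sqrt{d}]$ except with probability at most $2(n+1)\exp(-c_5 d)$.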
It remains to estimate the probabilities of the events $\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5)$. Let $I \subset [n+1]$ be any fixed $d$-subset, and let $J \subset I$ satisfy $|J| \geq \sqrt{d}$. Denote by $W_J$ the random $d \times |J|$ matrix with columns $X_i$, $i \in J$ (we arrange the column vectors according to their indices), and let $i_{\max}$ be the largest element of $J$. Note that the matrix $W = W_J$ satisfies the assumptions of Lemma 3.1 with $X = X_{i_{\max}}$ (our intention here is to treat both the case $i_{\max} = n+1$, when $X_{i_{\max}}$ is constant, and the case $i_{\max} \neq n+1$, when $X_{i_{\max}}$ is equidistributed with the rest of the vectors $X_i$, $i \in J$, in a uniform way, without splitting the proof into subcases). Observe that
$$\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5) \subset \big\{W_J \mbox{ has rank } |J|-1, \mbox{ the Euclidean norm of } X_{i_{\max}} \mbox{ is in the range } [\sqrt{d}/2,\, 2\sqrt{d}], \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible unit vector } \lambda = (\lambda_1, \dots, \lambda_{|J|})^\top \mbox{ with strictly positive components such that } W_J\lambda = 0\big\} \cap \big\{X_i,\ i \in I, \mbox{ are affinely independent; } {\rm conv}\{X_i,\ i \in I\} \mbox{ is on the boundary of } P'_d; \mbox{ and a unit vector } y \mbox{ orthogonal to } X_i,\ i \in I, \mbox{ satisfies } {\rm LCD}_{\alpha,\gamma_0}(y) \geq \exp(c_{2.24}d)\big\}.$$
Note that the vector $y$ in the description of the last event is uniquely determined by $X_i$, $i \in I$, up to a sign, and that the requirement that ${\rm conv}\{X_i,\ i \in I\}$ is [a subset of] a facet of $P'_d$ implies that the inner products $\langle y, X_i\rangle$, $i \notin I$, are either all non-positive or all non-negative. The $\sigma(X_i,\ i \in I)$-measurability of $\{\pm y\}$ then allows us to condition on any admissible realization of $X_i$, $i \in I$, and $\pm y$, and to use the conditional independence and identical distribution of $\langle y, X_i\rangle$, $i \notin I \cup \{n+1\}$, to get, in view of Lemma 3.1, the required exponential bound on ${\mathbb P}(\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5))$ whenever $\ell := |J|$ satisfies (3). For $\ell$ not satisfying (3), we shall apply Lemma 2.22. Again, let $i_{\max}$ be the largest index of $J$.
Observe that
$$\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5) \subset \big\{\mbox{the Euclidean norm of } X_{i_{\max}} \mbox{ is in the range } [\sqrt{d}/2,\, 2\sqrt{d}], \mbox{ and there is a } (\delta,\rho)\mbox{-incompressible vector } \lambda \mbox{ satisfying } W_J\lambda = 0\big\},$$
and that, whenever $\ell$ does not satisfy (3), necessarily $d/\ell \geq 1 + s$ for some $s > 0$ depending only on $K$. Applying Lemma 2.22 with this choice of $s$ and with $M = 5$, we obtain
$${\mathbb P}\big(\mathcal{E}_{J,I} \setminus (\mathcal{E}_4 \cup \mathcal{E}_5)\big) \leq \exp(-5d).$$
Combining the above estimates with the union bound over the admissible pairs $(J, I)$ completes the proof.
